Capturing protein multiscale thermal fluctuations
Authors
Abstract
Existing elastic network models are typically parametrized at a given cutoff distance and often fail to properly predict the thermal fluctuation of many macromolecules that involve multiple characteristic length scales. We introduce a multiscale flexibility-rigidity index (mFRI) method to resolve this problem. The proposed mFRI utilizes two or three correlation kernels parametrized at different length scales to capture protein interactions at corresponding scales. It is about 20% more accurate than the Gaussian network model (GNM) in the B-factor prediction of a set of 364 proteins. Additionally, the present method is able to deliver accurate predictions for multiscale macromolecules on which GNM fails. Finally, for a protein of N residues, mFRI scales linearly (O(N)) in computational complexity, in contrast to O(N^3) for GNM.
Proteins are among the most essential biomolecules for life. Many protein functions, such as structural support, catalysis of chemical reactions, and allosteric regulation, are strongly correlated with protein flexibility [13]. Protein flexibility is an intrinsic property of proteins and can be measured directly or indirectly by many experimental approaches, such as X-ray crystallography, nuclear magnetic resonance (NMR), and single-molecule force experiments [9]. Theoretically, protein flexibility can be computed by normal mode analysis (NMA) [6,14,22,32], graph theory [18], the rotation translation blocks (RTB) method [8,30], and elastic network models (ENM) [3–5,15,23,31], including the Gaussian network model (GNM) [4,5] and the anisotropic network model (ANM) [3]. A common feature of the above-mentioned time-independent methods is that they resort to matrix diagonalization. The computational complexity of matrix diagonalization is typically of the order of O(N^3), where N is the number of elements in the matrix. Such a computational complexity calls for new, efficient strategies for the flexibility analysis of large biomolecules.
*Address correspondence to Guo-Wei Wei. E-mail: [email protected]
arXiv:1505.05096v2 [q-bio.BM] 20 May 2015
It is well known that NMA and GNM do not work well for many macromolecules. Park et al. collected three sets of structures to test the performance of the NMA and GNM methods [26]. It was found that both methods fail and deliver negative correlation coefficients for many structures [26]. The mean correlation coefficients (MCCs) for the B-factor prediction of the small-sized, medium-sized, and large-sized sets of structures are about 0.480, 0.482, and 0.494 for NMA, respectively [25,26]. The GNM performs slightly better, with mean correlation coefficients of 0.541, 0.550, and 0.529 for the same test sets [25,26]. Clearly, there is a pressing need to develop innovative approaches for biomolecular flexibility analysis. Recently, we have proposed a few matrix-decomposition-free methods for flexibility analysis, including molecular nonlinear dynamics [36], stochastic dynamics [35], and the flexibility-rigidity index (FRI) [25,34]. Among them, the FRI has been introduced to evaluate protein flexibility and rigidity. The fundamental assumptions of the FRI method are as follows. Protein functions, such as flexibility, rigidity, and energy, are fully determined by the structure of the protein and its environment, and the protein structure is in turn determined by the relevant interactions. Therefore, whenever the protein structure is available, there is no need to analyze protein flexibility and rigidity by tracing back to the protein interaction Hamiltonian. Consequently, the FRI bypasses the O(N^3) matrix diagonalization. Our initial FRI [34] has a computational complexity of O(N^2), and our fast FRI (fFRI) [25], based on a cell lists algorithm [2], is of O(N). The FRI and the fFRI have been extensively validated on a set of 365 proteins for parametrization, accuracy, and reliability.
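The O(N) scaling attributed to the cell lists algorithm can be illustrated with a minimal Python sketch. It is not the fFRI implementation: the function name `kernel_sums_cell_list`, the truncation radius `cutoff`, and the kernel parameters `eta` and `kappa` are assumptions made here for illustration. Particles are binned into cubic cells whose edge equals the truncation radius, so each particle is compared only with particles in its own and adjacent cells; for structures of roughly uniform density this keeps the work per particle bounded.

```python
import math
from collections import defaultdict
from itertools import product


def kernel_sums_cell_list(coords, eta=3.0, kappa=1.0, cutoff=12.0):
    """Truncated sums S_i = sum_j exp(-(r_ij / eta)**kappa) over r_ij <= cutoff.

    All parameter values are illustrative assumptions, not the paper's.
    The self term (r_ii = 0, kernel value 1) is included in each sum.
    """
    # Bin every particle into a cubic cell of edge `cutoff`.
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(coords):
        key = (math.floor(x / cutoff),
               math.floor(y / cutoff),
               math.floor(z / cutoff))
        cells[key].append(idx)

    sums = [0.0] * len(coords)
    for (cx, cy, cz), members in cells.items():
        # Any pair closer than `cutoff` lies in the same or an adjacent cell.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbours = cells.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbours:
                    r = math.dist(coords[i], coords[j])
                    if r <= cutoff:
                        sums[i] += math.exp(-((r / eta) ** kappa))
    return sums
```

Truncating the kernel is what makes the binning useful: contributions beyond `cutoff` are dropped, which is a reasonable approximation only when the kernel has decayed to a negligible value at that distance.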
The parameter-free fFRI is about ten percent more accurate than the GNM on the 365-protein test set and is orders of magnitude faster than GNM on a set of 44 proteins. FRI is able to predict the B-factors of an HIV virus capsid (313,236 residues) in less than 30 seconds on a single-core processor, a task that would take GNM more than 120 years even if computer memory were not a limitation [25]. See the supplementary material for details. Nevertheless, there are structures for which FRI does not work either. In fact, structures on which NMA and GNM fail are likely to be difficult for FRI as well. One such structure is pictured in Figure 2, where the GNM method fails to predict the high flexibility of a hinge region in calmodulin at any cutoff distance. There are a number of reasons for this and other types of failure. Crystal environment, solvent type, co-factors, data collection conditions, and structural refinement procedures are well-known causes [16,20,21,29]. However, there is one more important cause that, to the best of our knowledge, has not been discussed in the literature, namely, multiple characteristic length scales in a single protein structure. Indeed, in contrast to small molecules, macromolecular interactions have a wide variety of characteristic length scales, ranging from covalent bond, hydrogen bond, and van der Waals bond scales to residue, alpha helix and beta sheet, domain, and protein scales. Protein flexibility is intrinsically associated with protein interactions, and thus must have a multiscale trait as well. When a GNM or FRI method is parametrized at a given cutoff or scale parameter, it captures only a subset of the characteristic length scales and inevitably misses the other characteristic length scales of the protein. Consequently, neither of them is able to provide an accurate B-factor prediction. The multiscale flexibility-rigidity index (mFRI) is constructed to capture the multiscale collective motions of macromolecules.
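Accuracy in these comparisons is reported as a correlation coefficient between predicted and experimental B-factors, averaged over a test set. A minimal sketch of that measure follows, assuming the Pearson correlation coefficient is meant; the text does not state the exact formula, and the function names `pearson_cc` and `mean_cc` are chosen here for illustration.

```python
import math


def pearson_cc(predicted, experimental):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e)
              for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0.0 or var_e == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_e)


def mean_cc(pairs):
    """Average of pearson_cc over (predicted, experimental) pairs, one per protein."""
    return sum(pearson_cc(p, e) for p, e in pairs) / len(pairs)
```

A negative value, as reported for some structures with NMA and GNM, means the predicted profile tends to be high where the experimental B-factors are low.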
We utilize multiple correlation kernels, each parametrized at a specific scale, to characterize the multiscale flexibility of macromolecules. The nth flexibility index of the ith (coarse-grained) particle is given by
$$ f_i^n = \frac{1}{\sum_{j=1}^{N} w_j^n \, \Phi\left(\|\mathbf{r}_i - \mathbf{r}_j\|; \eta_j^n\right)}, \qquad (1) $$
where $w_j^n$ is an atomic-type-dependent parameter, $\Phi(\|\mathbf{r}_i - \mathbf{r}_j\|; \eta_j^n)$ is a correlation kernel, and $\eta_j^n$ is a scale parameter. Here $\mathbf{r}_i$ and $\mathbf{r}_j$ are the coordinates of the ith and jth particles, respectively. We seek the minimization
$$ \min_{a^n, b} \sum_i \left| \sum_n a^n f_i^n + b - B_i^e \right|^2, \qquad (2) $$
where $\{B_i^e\}$ are the experimental B-factors. We use generalized exponential kernels [25,34]
$$ \Phi\left(\|\mathbf{r} - \mathbf{r}_j\|; \eta_j^n\right) = e^{-\left(\|\mathbf{r} - \mathbf{r}_j\| / \eta_j^n\right)^{\kappa}}, \quad \kappa > 0, \qquad (3) $$
and generalized Lorentz kernels
$$ \Phi\left(\|\mathbf{r} - \mathbf{r}_j\|; \eta_j^n\right) = \frac{1}{1 + \left(\|\mathbf{r} - \mathbf{r}_j\| / \eta_j^n\right)^{\upsilon}}, \quad \upsilon > 0. \qquad (4) $$
In principle, all parameters can be optimized. For simplicity and computational efficiency, we only determine $\{a^n\}$ and $b$ in the above minimization process. In this work, we limit the number of kernels to at most three and set $w_j^n = 1$. Both generalized exponential kernels and generalized Lorentz kernels are employed. A more detailed description of the mFRI is given in the supplementary material.
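The procedure described above (one flexibility index per kernel, then a linear least-squares fit of the coefficients and the offset against experimental B-factors) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' code: all weights are set to 1 as in the text, the kernel sums are evaluated directly in O(N^2) for clarity, the fit is solved through the normal equations, and every function name and numerical parameter value is hypothetical.

```python
import math


def exp_kernel(r, eta, kappa=1.0):
    """Generalized exponential kernel, Eq. (3)."""
    return math.exp(-((r / eta) ** kappa))


def lorentz_kernel(r, eta, upsilon=3.0):
    """Generalized Lorentz kernel, Eq. (4)."""
    return 1.0 / (1.0 + (r / eta) ** upsilon)


def flexibility_indices(coords, kernels):
    """F[n][i] = 1 / sum_j Phi_n(|r_i - r_j|), i.e. Eq. (1) with all weights 1."""
    return [[1.0 / sum(phi(math.dist(ri, rj)) for rj in coords)
             for ri in coords]
            for phi in kernels]


def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    m = len(y)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, m):
            factor = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, m))
        x[r] = (M[r][m] - tail) / M[r][r]
    return x


def fit_mfri(coords, b_exp, kernels):
    """Least-squares fit of Eq. (2); returns (a, b, predicted B-factors)."""
    F = flexibility_indices(coords, kernels)
    cols = F + [[1.0] * len(coords)]  # the constant column carries the offset b
    A = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    y = [sum(u * v for u, v in zip(ci, b_exp)) for ci in cols]
    *a, b = solve_linear(A, y)
    pred = [sum(an * fn[i] for an, fn in zip(a, F)) + b
            for i in range(len(coords))]
    return a, b, pred
```

A call might pass two kernels at different scales, for example `[lambda r: exp_kernel(r, 3.0), lambda r: lorentz_kernel(r, 15.0)]`; these scale values are placeholders, not the parametrization used in the paper. The normal equations can become ill-conditioned when two flexibility indices are nearly proportional, in which case a more robust least-squares solver would be preferable.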
Similar resources
Communication: Capturing protein multiscale thermal fluctuations.
Existing elastic network models are typically parametrized at a given cutoff distance and often fail to properly predict the thermal fluctuation of many macromolecules that involve multiple characteristic length scales. We introduce a multiscale flexibility-rigidity index (mFRI) method to resolve this problem. The proposed mFRI utilizes two or three correlation kernels parametrized at different...
Algorithm Refinement for Fluctuating Hydrodynamics
This paper introduces an adaptive mesh and algorithm refinement method for fluctuating hydrodynamics. This particle-continuum hybrid simulates the dynamics of a compressible fluid with thermal fluctuations. The particle al-
On the characterization of protein native state ensembles.
Describing and understanding the biological function of a protein requires a detailed structural and thermodynamic description of the protein's native state ensemble. Obtaining such a description often involves characterizing equilibrium fluctuations that occur beyond the nanosecond timescale. Capturing such fluctuations remains nontrivial even for very long molecular dynamics and Monte Carlo s...
A Renormalization Approach to Simulations of Quantum Effects in Nanoscale Magnetic Systems
Simulation of the dynamics of nanoscale systems are usually restricted to a purely classical description, which becomes inadequate once technological minimization reaches scales on which quantum mechanical effects become relevant. In this paper we propose a multiscale approach to estimate quantum corrections to such classical descriptions. Quantum fluctuations in a quantum magnet are absorbed i...
Fluctuations in the heterogeneous multiscale methods for fast–slow systems
How heterogeneous multiscale methods (HMM) handle fluctuations acting on the slow variables in fast–slow systems is investigated. In particular, it is shown via analysis of central limit theorem (CLT) and large deviation principle (LDP) that the standard version of HMM artificially amplifies these fluctuations. A simple modification of HMM, termed parallel HMM, is introduced and is shown to rem...
A Hybrid Particle-Continuum Method for Hydrodynamics of Complex Fluids
A previously-developed hybrid particle-continuum method [J. B. Bell, A. Garcia and S. A. Williams, SIAM Multiscale Modeling and Simulation, 6:1256-1280, 2008 ] is generalized to dense fluids and two and three dimensional flows. The scheme couples an explicit fluctuating compressible Navier-Stokes solver with the Isotropic Direct Simulation Monte Carlo (DSMC) particle method [A. Donev and A. L. ...
Journal title:
Volume, issue
Pages -
Publication date: 2015